Domain Adaptation with Randomized Expectation Maximization

نویسندگان

  • Twan van Laarhoven
  • Elena Marchiori
چکیده

Domain adaptation (DA) is the task of classifying an unlabeled dataset (target) using a labeled dataset (source) from a related domain. The majority of successful DA methods try to directly match the distributions of the source and target data by transforming the feature space. Despite their success, state of the art methods based on this approach are either involved or unable to directly scale to data with many features. This article shows that domain adaptation can be successfully performed by using a very simple randomized expectation maximization (EM) method. We consider two instances of the method, which involve logistic regression and support vector machine, respectively. The underlying assumption of the proposed method is the existence of a good single linear classifier for both source and target domain. The potential limitations of this assumption are alleviated by the flexibility of the method, which can directly incorporate deep features extracted from a pre-trained deep neural network. The resulting algorithm is strikingly easy to implement and apply. We test its performance on 36 real-life adaptation tasks over text and image data with diverse characteristics. The method achieves state-of-the-art results, competitive with those of involved end-to-end deep transfer-learning methods. Source code is available at http://github.com/twanvl/adrem.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Domain Adaptation with Active Learning for Word Sense Disambiguation

When a word sense disambiguation (WSD) system is trained on one domain but applied to a different domain, a drop in accuracy is frequently observed. This highlights the importance of domain adaptation for word sense disambiguation. In this paper, we first show that an active learning approach can be successfully used to perform domain adaptation of WSD systems. Then, by using the predominant se...

متن کامل

Learning Latent Word Representations for Domain Adaptation using Supervised Word Clustering

Domain adaptation has been popularly studied on exploiting labeled information from a source domain to learn a prediction model in a target domain. In this paper, we develop a novel representation learning approach to address domain adaptation for text classification with automatically induced discriminative latent features, which are generalizable across domains while informative to the predic...

متن کامل

Mixture of latent words language models for domain adaptation

This paper introduces a novel language model (LM) adaptation method based on mixture of latent word language models (LWLMs). LMs are often constructed as mixture of n-gram models, whose mixture weights are optimized using target domain data. However, n-gram mixture modeling is not flexible enough for domain adaptation because model merger is conducted on the observed word space. Since the words...

متن کامل

Improved Approximations for Non-monotone Submodular Maximization with Knapsack Constraints

Submodular maximization generalizes many fundamental problems in discrete optimization,including Max-Cut in directed/undirected graphs, maximum coverage, maximum facility loca-tion and marketing over social networks.In this paper we consider the problem of maximizing any submodular function subject to dknapsack constraints, where d is a fixed constant. We establish a strong ...

متن کامل

Domain Adaptation for Statistical Classifiers

The most basic assumption used in statistical learning theory is that training data and test data are drawn from the same underlying distribution. Unfortunately, in many applications, the “in-domain” test data is drawn from a distribution that is related, but not identical, to the “out-of-domain” distribution of the training data. We consider the common case in which labeled out-of-domain data ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2018